Method for determining the risk of recurrence of an estrogen receptor-positive and HER2-negative primary mammary carcinoma under an endocrine therapy

ABSTRACT

Provided herein are methods for predicting a result relating to breast cancer in a patient, the method comprising (a) determining the RNA expression levels of three or more of the following 8 genes in a tumor sample from the patient: UBE2C, BIRC5, DHCR7, STC2, AZGP1, RBBP8, IL6ST and MGP; and (b) mathematically combining the expression level values for the genes of the mentioned set to obtain a combined score, the combined score indicating a prognosis for the patient, wherein the RNA expression level values have at least in part not been normalized before the mathematical combination.

This application claims priority to International Application No. PCT/EP2017/055601, filed Mar. 9, 2017, which claims priority benefit to EP 16159481.7, filed Mar. 9, 2016, the entire contents of which are hereby incorporated by reference.

BACKGROUND

The invention relates to a method for predicting a result relating to breast cancer in an estrogen receptor-positive and HER2-negative tumor in a breast cancer patient.

The EndoPredict® score (EP score) is a multivariate score for determining the risk of remote metastases in patients with an estrogen receptor-positive and HER2-negative primary mammary carcinoma under a sole adjuvant endocrine therapy (Filipits et al., “A new molecular predictor of distant recurrence in ER-positive, HER2-negative breast cancer adds independent information to conventional clinical risk factors,” Clin. Cancer Res. 17:6012-6020 (2011); EP 2 553 118 B1). The EP score is a numerical measure of the relative risk that the tumor of the breast cancer patient examined with this EP score will develop remote metastases within 10 years. The determined risk can thus be used to support the decision whether breast cancer patients should be treated with chemotherapy, or whether a milder hormone therapy is sufficient as a treatment. Patients with a relative risk of metastases under an endocrine therapy of more than 10% usually undergo chemotherapy. If the risk of metastases is lower, most physicians recommend the milder hormone therapy. The present invention fulfills the need for advanced methods for the prognosis of breast cancer.

SUMMARY

In an embodiment, a method for predicting a result relating to breast cancer in an estrogen receptor-positive and HER2-negative tumor in a breast cancer patient is provided. The method comprises (a) determining the RNA expression levels of at least 4 of the following 8 genes in a tumor sample from the patient: UBE2C, BIRC5, DHCR7, STC2, AZGP1, RBBP8, IL6ST and MGP; and (b) mathematically combining the expression level values for the genes of the mentioned set, the values having been determined in the tumor sample, to obtain a combined score, the combined score indicating a prognosis for the patient, wherein the RNA expression level values have at least in part not been normalized before the mathematical combination. In an embodiment, the at least 4 genes are BIRC5, UBE2C, RBBP8, and IL6ST. In an embodiment, the at least 4 genes are any of the panels described in Table 1. In an embodiment, said mathematically combining the expression levels is effected by using the formula

$\text{combined score} = \sum_{i=1}^{8} -c_{i} x_{i} - 2.432381$

or

$\text{combined score} = \sum_{i=1}^{k} -c_{i} x_{i} + \left(20 + \tilde{r}\right) \sum_{i=1}^{k} c_{i} + \sum_{i=k+1}^{8} -c_{i} x_{i} + \left(20 + r\right) \sum_{i=k+1}^{8} c_{i}.$
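As a purely algebraic note on the second formula, its terms can be grouped gene by gene. The reading of the symbols is an assumption not stated in this section: $x_{i}$ is taken to be the measured value of gene $i$, $r$ a reference value used for the normalized genes, and $\tilde{r}$ a substitute value used for the $k$ genes that are not normalized.

$\sum_{i=1}^{k} -c_{i} x_{i} + \left(20 + \tilde{r}\right) \sum_{i=1}^{k} c_{i} = \sum_{i=1}^{k} c_{i} \left(20 + \tilde{r} - x_{i}\right)$

$\sum_{i=k+1}^{8} -c_{i} x_{i} + \left(20 + r\right) \sum_{i=k+1}^{8} c_{i} = \sum_{i=k+1}^{8} c_{i} \left(20 + r - x_{i}\right)$

Under that reading, each gene contributes its coefficient times a shifted measured value, and the two groups of genes differ only in whether $r$ or $\tilde{r}$ enters the shift.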

In an embodiment, said patient has received endocrine therapy or is contemplated to receive endocrine treatment. In an embodiment, a risk of developing breast cancer recurrence or cancer-related death is predicted. In an embodiment, said expression level is determined as a messenger RNA expression level. In an embodiment, said expression level is determined by at least one of a PCR based method, a microarray based method, and a hybridization based method. In an embodiment, said determination of expression levels is in a formalin-fixed paraffin embedded tumor sample or in a fresh-frozen tumor sample. In an embodiment, one, two or more thresholds are determined for said combined score, which discriminate between high and low risk; high, intermediate and low risk; or more risk groups when the thresholds are applied to the combined score. In an embodiment, a high combined score is indicative of benefit from cytotoxic chemotherapy. In an embodiment, information regarding nodal status of the patient is processed in the step of mathematically combining expression level values for the genes to yield a combined score. In an embodiment, said information regarding nodal status is a numerical value if said nodal status is negative, a different numerical value if said nodal status is positive, and a different or identical number if said nodal status is unknown.

In another embodiment, a kit is provided for performing a method according to the methods described herein. In an embodiment, said kit comprises a set of oligonucleotides capable of specifically binding to sequences, or to sequences of fragments, of the genes in a combination of genes, wherein said combination comprises at least 4 of the following 8 genes, the RNA expression levels of which are determined in a tumor sample from the patient: UBE2C, BIRC5, DHCR7, STC2, AZGP1, RBBP8, IL6ST and MGP. In an embodiment, the at least 4 genes of the kit are BIRC5, UBE2C, RBBP8, and IL6ST. In an embodiment, the at least 4 genes are any of the panels described in Table 1.

In another embodiment, a computer program product is provided. In an embodiment, the computer program product is capable of processing values representative of expression levels of a set of genes, mathematically combining said values to yield a combined score, wherein said combined score is indicative of efficacy from endocrine therapy of said patient, according to any of the methods as described herein.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows the deviation of EP scores generated by the alternative algorithm where BIRC5, AZGP1, and STC2 are not normalized. The graph illustrates the deviation of the alternative algorithm of the Example described herein from the EP score generated by the original EP score algorithm described in EP2553118B1. The deviation from the original algorithm on the Y axis is plotted against the amount of input RNA, as determined by the mean Ct value of the housekeeping genes displayed on the X axis.

FIG. 2 shows the deviation of EP scores generated by the alternative algorithm where BIRC5, IL6ST, and STC2 are not normalized. The graph illustrates the deviation of the alternative algorithm of the Example described herein from the EP score generated by the original EP score algorithm described in EP2553118B1. The deviation from the original algorithm on the Y axis is plotted against the amount of input RNA, as determined by the mean Ct value of the housekeeping genes displayed on the X axis.

FIG. 3 shows the deviation of EP scores generated by the alternative algorithm where IL6ST, DHCR7, and STC2 are not normalized. The graph illustrates the deviation of the alternative algorithm of the Example described herein from the EP score generated by the original EP score algorithm described in EP2553118B1. The deviation from the original algorithm on the Y axis is plotted against the amount of input RNA, as determined by the mean Ct value of the housekeeping genes displayed on the X axis.

FIG. 4 shows the deviation of EP scores generated by the alternative algorithm where all eight EP genes are not normalized. The graph illustrates the deviation of the alternative algorithm of the Example described herein from the EP score generated by the original EP score algorithm described in EP2553118B1. The deviation from the original algorithm on the Y axis is plotted against the amount of input RNA, as determined by the mean Ct value of the housekeeping genes displayed on the X axis.

DETAILED DESCRIPTION

Definitions

Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

The term “cancer” refers to uncontrolled cellular growth, and is not limited to any stage, grade, histomorphological feature, aggressiveness, or malignancy of an affected tissue or cell aggregation.

The term “predicting an outcome” of a disease, as used herein, is meant to include both a prediction of an outcome of a patient undergoing a given therapy and a prognosis of a patient who is not treated. The term “predicting an outcome” may, in particular, relate to the risk of a patient developing metastasis, local recurrence or death.

The term “prediction”, as used herein, relates to an individual assessment of the malignancy of a tumor, or to the expected survival rate (OAS, overall survival or DFS, disease free survival) of a patient, if the tumor is treated with a given therapy. In contrast thereto, the term “prognosis” relates to an individual assessment of the malignancy of a tumor, or to the expected survival rate (OAS, overall survival or DFS, disease free survival) of a patient, if the tumor remains untreated.

An “outcome” within the meaning of the present invention is a defined condition attained in the course of the disease. This disease outcome may e.g. be a clinical condition such as “recurrence of disease”, “development of metastasis”, “development of nodal metastasis”, “development of distant metastasis”, “survival”, “death”, “tumor remission rate”, a disease stage or grade or the like.

A “risk” is understood to be a number related to the probability of a subject or a patient to develop or arrive at a certain disease outcome. The term “risk” in the context of the present invention is not meant to carry any positive or negative connotation with regard to a patient's wellbeing but merely refers to a probability or likelihood of an occurrence or development of a given condition.

The term “clinical data” relates to the entirety of available data and information concerning the health status of a patient including, but not limited to, age, sex, weight, menopausal/hormonal status, etiopathology data, anamnesis data, data obtained by in vitro diagnostic methods such as histopathology, blood or urine tests, data obtained by imaging methods, such as x-ray, computed tomography, MRI, PET, SPECT, ultrasound, electrophysiological data, genetic analysis, gene expression analysis, biopsy evaluation, and intraoperative findings.

The term “node positive”, “diagnosed as node positive”, “node involvement” or “lymph node involvement” means a patient having previously been diagnosed with lymph node metastasis. It shall encompass draining lymph node, near lymph node, and distant lymph node metastasis. This previous diagnosis itself shall not form part of the inventive method. Rather it is a precondition for selecting patients whose samples may be used for one embodiment of the present invention. This previous diagnosis may have been arrived at by any suitable method known in the art, including, but not limited to, lymph node removal and pathological analysis, biopsy analysis, in-vitro analysis of biomarkers indicative for metastasis, imaging methods (e.g. computed tomography, X-ray, magnetic resonance imaging, ultrasound), and intraoperative findings.

In the context of the present invention a “biological sample” is a sample which is derived from or has been in contact with a biological organism. Examples for biological samples are: cells, tissue, body fluids, lavage fluid, smear samples, biopsy specimens, blood, urine, saliva, sputum, plasma, serum, cell culture supernatant, and others.

A “tumor sample” is a biological sample containing tumor cells, whether intact or degraded. The sample may be of any biological tissue or fluid. Such samples include, but are not limited to, sputum, blood, serum, plasma, blood cells (e.g., white cells), tissue, core or fine needle biopsy samples, cell-containing body fluids, urine, peritoneal fluid, and pleural fluid, liquor cerebrospinalis, tear fluid, or cells isolated therefrom. This may also include sections of tissues such as frozen or fixed sections taken for histological purposes or microdissected cells or extracellular parts thereof. A tumor sample to be analyzed can be tissue material from a neoplastic lesion taken by aspiration or puncture, excision or by any other surgical method leading to biopsy or resected cellular material. Such a sample comprises tumor cells or tumor cell fragments obtained from the patient. The cells may be found in a cell “smear” collected, for example, by a nipple aspiration, ductal lavage, fine needle biopsy or from provoked or spontaneous nipple discharge. In another embodiment, the sample is a body fluid. Such fluids include, but are not limited to, blood fluids, serum, plasma, lymph, ascitic fluids, gynecologic fluids, or urine.

A “gene” is a set of segments of nucleic acid that contains the information necessary to produce a functional RNA product. A “gene product” is a biological molecule produced through transcription or expression of a gene, e.g., an mRNA, cDNA or the translated protein.

An “mRNA” is the transcribed product of a gene and shall have the ordinary meaning understood by a person skilled in the art. A “molecule derived from an mRNA” is a molecule which is chemically or enzymatically obtained from an mRNA template, such as cDNA.

The term “expression level” refers to a determined level of gene expression. This may be a determined level of gene expression as an absolute value or compared to a reference gene (e.g. a housekeeping gene), to the average of two or more reference genes, or to a computed average expression value (e.g. in DNA chip analysis) or to another informative gene without the use of a reference sample. The expression level of a gene may be measured directly, e.g. by obtaining a signal wherein the signal strength is correlated to the amount of mRNA transcripts of that gene, or it may be obtained indirectly at a protein level, e.g., by immunohistochemistry, CISH, ELISA or RIA methods. The expression level may also be obtained by way of a competitive reaction to a reference sample. An expression value which is determined by measuring some physical parameter in an assay, e.g. fluorescence emission, may be assigned a numerical value which may be used for further processing of information.

A “reference pattern of expression levels” within the meaning of the invention shall be understood as being any pattern of expression levels that can be used for the comparison to another pattern of expression levels. In a preferred embodiment of the invention, a reference pattern of expression levels is, e.g., an average pattern of expression levels observed in a group of healthy individuals, diseased individuals, or diseased individuals having received a particular type of therapy, serving as a reference group, or individuals with good or bad outcome.

The term “mathematically combining expression levels”, within the meaning of the invention, shall be understood as deriving a numeric value from a determined expression level of a gene and applying an algorithm to one or more of such numeric values to obtain a combined numerical value or combined score.

An “algorithm” is a process that performs some sequence of operations to produce information.

A “score” is a numeric value that was derived by mathematically combining expression levels using an algorithm. It may also be derived from expression levels and other information, e.g. clinical data. A score may be related to the outcome of a patient's disease. An EndoPredict® score (EP score) is a multivariate score for determining the risk of remote metastases in patients with an estrogen receptor-positive and HER2-negative primary mammary carcinoma under a sole adjuvant endocrine therapy. The EP score is a numerical measure of the relative risk that the tumor of the breast cancer patient examined with this EP score will develop remote metastases within 10 years.

A “discriminant function” is a function of a set of variables used to classify an object or event. A discriminant function thus allows classification of a patient, sample or event into a category or a plurality of categories according to data or parameters available from said patient, sample or event. Such classification is a standard instrument of statistical analysis well known to the skilled person. For example, a patient may be classified as “high risk” or “low risk”, “high probability of metastasis” or “low probability of metastasis”, “in need of treatment” or “not in need of treatment” according to data obtained from said patient, sample or event. Classification is not limited to “high vs. low”, but may be performed into a plurality of categories, grading or the like. Classification shall also be understood in a wider sense as a discriminating score, where e.g. a higher score represents a higher likelihood of distant metastasis, e.g., the (overall) risk of a distant metastasis. Examples for discriminant functions which allow a classification include, but are not limited to, functions defined by support vector machines (SVM), k-nearest neighbors (kNN), (naive) Bayes models, linear regression models or piecewise defined functions such as, for example, in subgroup discovery, in decision trees, in logical analysis of data (LAD) and the like. In a wider sense, continuous score values of mathematical methods or algorithms, such as correlation coefficients, projections, support vector machine scores, other similarity-based methods, combinations of these and the like are examples for illustrative purpose.
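The simplest kind of classification described above, applying one or more thresholds to a continuous score, may be sketched as follows. This is a minimal illustration only: the function name `risk_group`, the cut-off values and the labels are hypothetical and are not taken from this disclosure.

```python
from bisect import bisect_right


def risk_group(score, thresholds, labels):
    """Map a continuous score to a risk label using ascending cut-offs.

    A score exactly equal to a cut-off is assigned to the higher group
    (an arbitrary choice made for this sketch).
    """
    cutoffs = sorted(thresholds)
    if len(labels) != len(cutoffs) + 1:
        raise ValueError("need exactly one more label than thresholds")
    # bisect_right counts how many cut-offs are <= score,
    # which is the index of the group the score falls into.
    return labels[bisect_right(cutoffs, score)]


# Hypothetical cut-offs, for illustration only.
two_groups = risk_group(4.2, [5.0], ["low risk", "high risk"])
three_groups = risk_group(
    4.5, [3.0, 6.0], ["low risk", "intermediate risk", "high risk"]
)
```

One threshold yields two groups and two thresholds yield three; more elaborate discriminant functions such as SVM or kNN classifiers would replace the threshold lookup but keep the same input (patient data) and output (a category).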

The term “therapy modality”, “therapy mode”, “regimen” as well as “therapy regimen” refers to a temporally sequential or simultaneous administration of anti-tumor, and/or anti vascular, and/or immune stimulating, and/or blood cell proliferative agents, and/or radiation therapy, and/or hyperthermia, and/or hypothermia for cancer therapy. The administration of these can be performed in an adjuvant and/or neoadjuvant mode. The composition of such a “protocol” may vary in the dose of the single agent, timeframe of application and frequency of administration within a defined therapy window. Currently various combinations of various drugs and/or physical methods, and various schedules are under investigation.

The term “cytotoxic chemotherapy” refers to various treatment modalities affecting cell proliferation and/or survival. The treatment may include administration of alkylating agents, antimetabolites, anthracyclines, plant alkaloids, topoisomerase inhibitors, and other antitumor agents, including monoclonal antibodies and kinase inhibitors. In particular, the cytotoxic treatment may relate to a taxane treatment. Taxanes are plant alkaloids which block cell division by preventing microtubule function. The prototype taxane is the natural product paclitaxel, originally known as Taxol and first derived from the bark of the Pacific Yew tree. Docetaxel is a semi-synthetic analogue of paclitaxel. Taxanes enhance stability of microtubules, preventing the separation of chromosomes during anaphase.

The term “endocrine treatment” or “hormonal treatment” (sometimes also referred to as “anti-hormonal treatment”) denotes a treatment which targets hormone signaling, e.g. hormone inhibition, hormone receptor inhibition, use of hormone receptor agonists or antagonists, use of scavenger- or orphan receptors, use of hormone derivatives and interference with hormone production. Particular examples are tamoxifen therapy, which modulates signaling of the estrogen receptor, or aromatase inhibitor treatment, which interferes with steroid hormone production.

Tamoxifen is an orally active selective estrogen receptor modulator (SERM) that is used in the treatment of breast cancer and is currently the world's largest selling drug for that purpose. Tamoxifen is sold under the trade names Nolvadex, Istubal, and Valodex. However, the drug, even before its patent expiration, was and still is widely referred to by its generic name “tamoxifen.” Tamoxifen and tamoxifen derivatives competitively bind to estrogen receptors on tumors and other tissue targets, producing a nuclear complex that decreases RNA synthesis and inhibits estrogen effects.

Steroid receptors are intracellular receptors (typically cytoplasmic) that perform signal transduction for steroid hormones. Examples include type I receptors, in particular sex hormone receptors, e.g. androgen receptor, estrogen receptor, progesterone receptor; glucocorticoid receptor, mineralocorticoid receptor; and type II receptors, e.g. vitamin A receptor, vitamin D receptor, retinoid receptor, thyroid hormone receptor.

The term “hybridization-based method”, as used herein, refers to methods involving a process of combining complementary, single-stranded nucleic acids or nucleotide analogues into a single double stranded molecule. Nucleotides or nucleotide analogues will bind to their complement under normal conditions, so two perfectly complementary strands will bind to each other readily. In bioanalytics, very often labeled, single stranded probes are used in order to find complementary target sequences. If such sequences exist in the sample, the probes will hybridize to said sequences, which can then be detected due to the label. Other hybridization based methods comprise microarray and/or biochip methods. Therein, probes are immobilized on a solid phase, which is then exposed to a sample. If complementary nucleic acids exist in the sample, these will hybridize to the probes and can thus be detected. These approaches are also known as “array based methods.” Yet another hybridization based method is PCR, which is described below. When it comes to the determination of expression levels, hybridization based methods may for example be used to determine the amount of mRNA for a given gene.

An oligonucleotide capable of specifically binding sequences of a gene or fragments thereof relates to an oligonucleotide which specifically hybridizes to a gene or gene product, such as the gene's mRNA or cDNA, or to a fragment thereof. To specifically detect the gene or gene product, it is not necessary to detect the entire gene sequence. A fragment of about 20-150 bases will contain enough sequence specific information to allow specific hybridization.

The term “a PCR based method” as used herein refers to methods comprising a polymerase chain reaction (PCR). This is a method of exponentially amplifying nucleic acids, e.g. DNA, by enzymatic replication in vitro. As PCR is an in vitro technique, it can be performed without restrictions on the form of DNA, and it can be extensively modified to perform a wide array of genetic manipulations. When it comes to the determination of expression levels, a PCR based method may for example be used to detect the presence of a given mRNA by (1) reverse transcription of the complete mRNA pool (the so called transcriptome) into cDNA with help of a reverse transcriptase enzyme, and (2) detecting the presence of a given cDNA with help of respective primers. This approach is commonly known as reverse transcriptase PCR (rtPCR). Moreover, PCR-based methods comprise e.g. real time PCR, and, particularly suited for the analysis of expression levels, kinetic or quantitative PCR (qPCR).

The term “quantitative PCR” (qPCR) refers to any type of PCR method which allows the quantification of the template in a sample. Quantitative real-time PCR comprises different techniques of performance or product detection, for example the TaqMan technique or the LightCycler technique. The TaqMan technique, for example, uses a dual-labelled fluorogenic probe. TaqMan real-time PCR measures accumulation of a product via the fluorophore during the exponential stages of the PCR, rather than at the end point as in conventional PCR. The exponential increase of the product is used to determine the threshold cycle, Ct, i.e., the number of PCR cycles at which a significant exponential increase in fluorescence is detected, and which is inversely correlated with the number of copies of DNA template present in the reaction. The set up of the reaction is very similar to a conventional PCR, but is carried out in a real-time thermal cycler that allows measurement of fluorescent molecules in the PCR tubes. Different from regular PCR, in TaqMan real-time PCR a probe is added to the reaction, i.e., a single-stranded oligonucleotide complementary to a segment of 20-60 nucleotides within the DNA template and located between the two primers. A fluorescent reporter or fluorophore (e.g., 6-carboxyfluorescein, acronym: FAM, or tetrachlorofluorescein, acronym: TET) and quencher (e.g., tetramethylrhodamine, acronym: TAMRA, or dihydrocyclopyrroloindole tripeptide ‘black hole quencher’, acronym: BHQ) are covalently attached to the 5′ and 3′ ends of the probe, respectively. The close proximity between fluorophore and quencher attached to the probe inhibits fluorescence from the fluorophore. During PCR, as DNA synthesis commences, the 5′ to 3′ exonuclease activity of the Taq polymerase degrades that proportion of the probe that has annealed to the template. Degradation of the probe releases the fluorophore from it and breaks the close proximity to the quencher, thus relieving the quenching effect and allowing fluorescence of the fluorophore. Hence, fluorescence detected in the real-time PCR thermal cycler is directly proportional to the fluorophore released and the amount of DNA template present in the PCR.

By “array” or “matrix” an arrangement of addressable locations or “addresses” on a device is meant. The locations can be arranged in two dimensional arrays, three dimensional arrays, or other matrix formats. The number of locations can range from several to at least hundreds of thousands. Most importantly, each location represents a totally independent reaction site. Arrays include but are not limited to nucleic acid arrays, protein arrays and antibody arrays. A “nucleic acid array” refers to an array containing nucleic acid probes, such as oligonucleotides, nucleotide analogues, polynucleotides, polymers of nucleotide analogues, morpholinos or larger portions of genes. The nucleic acid and/or analogue on the array is preferably single stranded. Arrays wherein the probes are oligonucleotides are referred to as “oligonucleotide arrays” or “oligonucleotide chips.” A “microarray,” herein also referred to as a “biochip” or “biological chip”, is an array of regions having a density of discrete regions of at least about 100/cm², and preferably at least about 1000/cm².

“Primer pairs” and “probes” within the meaning of the invention shall have the ordinary meaning of these terms, which is well known to the person skilled in the art of molecular biology. In a preferred embodiment of the invention “primer pairs” and “probes” shall be understood as being polynucleotide molecules having a sequence identical, complementary, homologous, or homologous to the complement of regions of a target polynucleotide which is to be detected or quantified. In yet another embodiment, nucleotide analogues are also comprised for usage as primers and/or probes. Probe technologies used for kinetic or real time PCR applications could be e.g. TaqMan® systems obtainable at Applied Biosystems, extension probes such as Scorpion® Primers, Dual Hybridisation Probes, Amplifluor® obtainable at Chemicon International, Inc., or Minor Groove Binders.

“Individually labeled probes”, within the meaning of the invention, shall be understood as being molecular probes comprising a polynucleotide, oligonucleotide or nucleotide analogue and a label, helpful in the detection or quantification of the probe. Preferred labels are fluorescent molecules, luminescent molecules, radioactive molecules, enzymatic molecules and/or quenching molecules.

“Arrayed probes”, within the meaning of the invention, shall be understood as being a collection of immobilized probes, preferably in an orderly arrangement. In a preferred embodiment of the invention, the individual “arrayed probes” can be identified by their respective position on the solid support, e.g., on a “chip”.

When used in reference to a single-stranded nucleic acid sequence, the term “substantially homologous” refers to any probe that can hybridize to (i.e., it is the complement of) the single-stranded nucleic acid sequence under conditions of low stringency as described above.

To determine an EP score, the relative RNA expression of eight genes is measured, and their measured values are used for calculation by means of a discriminant function. The RNA expression can be determined with any technical method suitable for quantifying RNA. Because of its high analytical sensitivity and the possibility to analyze even small RNA fragments obtained in the recovery of tumor RNA from formalin-fixed and paraffin-embedded breast cancer tissue, the quantitative polymerase chain reaction with previous reverse transcription (RT-qPCR) is a suitable technical mode for performing the analysis. However, microarray analysis or RNA sequencing are equally suitable for determining an EP score. The EndoPredict® score and the necessary technical method for determining it are described in Filipits et al. (2011) and in EP 2 553 118, both of which are incorporated herein by reference.

For the described calculation of the EP score, the measured values of the mRNA expression of a total of 11 genes are used. Among these, eight are so-called informative genes, whose expression level in combination correlates with the further course of the disease. The three remaining genes are reference genes, sometimes referred to as “normalization genes”.

The measured value obtained upon performing RT-qPCR, which inversely correlates with the quantity of RNA present in the analyzed sample, is the Ct value. It indicates after how many amplification cycles a sufficient amount of the PCR probe has been enzymatically degraded, so that the thus achieved reduction of the fluorescence quenching of the PCR dye by the PCR quencher is sufficient to be able to measure the fluorescence of the PCR dye. Therefore, a high Ct value in RT-qPCR is an indicator of a small amount of RNA to be analyzed in a sample.
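The inverse relationship between Ct value and RNA quantity can be illustrated with a short sketch. It assumes an idealized amplification in which the product doubles in every cycle (`efficiency=2.0`); real assays deviate from this, and the function name `fold_difference` is introduced here for illustration only.

```python
def fold_difference(ct_a, ct_b, efficiency=2.0):
    """Estimate how many times more template sample A contains than sample B.

    Assumes (idealization) that the amount of product is multiplied by
    `efficiency` in every cycle, so that quantity is proportional to
    efficiency ** (-Ct). A lower Ct therefore means more starting material.
    """
    return efficiency ** (ct_b - ct_a)


# Sample A crosses the threshold three cycles earlier than sample B,
# so under the doubling assumption it holds 2**3 times as much template.
ratio = fold_difference(25.0, 28.0)
```

Under this idealization, a difference of one cycle corresponds to a factor of two in starting material, which is why Ct values behave like a logarithmic measure of quantity.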

The level of the Ct value depends on the concentration of the analyzed RNA in the sample, and also primarily on the total amount of RNA in the sample. However, especially in the analysis of a tissue sample, it is difficult to precisely define the amount of analyzed tissue and thus to be able to calculate a concentration in the tissue. This is mainly because tissues are mostly heterogeneous. The water content above all, but also the lipid content or the proportion of non-cellular components, can vary significantly. Thus, variations in the analysis of the RNA amounts of different genes in human or animal tissue often reflect the variation of the amount of the cellular fraction of the tissue subjected to the analysis rather than the actually interesting biological differences between different tissue samples. In addition, the result of an RNA quantification is often substantially affected by the integrity of the RNA to be analyzed and by the amplification efficiency of the reagents employed. Therefore, the Ct values obtained in the RNA analysis of tissue are often primarily the product of different experimental factors, and to a lesser extent caused by the actually examined biological differences between the analyzed samples. Thus, if it is desired to measure the concentration of RNA in the cells of a tissue sample, the Ct value as a raw measured value of RT-qPCR is usually unsuitable.

Therefore, in order to be able to compare the RNA concentrations in two different tissue samples in a reasonable way, the Ct values must always be normalized on the basis of an invariant reference quantity. The obvious approach would be to normalize the Ct value on the basis of a particular amount of tissue, for example, one milligram or one microgram. However, because of the heterogeneity of the tissue, this method is practicable only to a very limited degree and is rarely used. The most common method in RT-qPCR is the normalization of the Ct values of the analyzed RNA transcripts (genes of interest or GOI) on the basis of the Ct value of one or more other, invariant genes in the same sample. These invariant genes are mostly referred to as reference or normalization genes, sometimes also as “housekeeper genes.” The invariance of the RNA expression of the normalization gene under the measuring conditions is the primary requirement demanded of a normalization gene. A variability of the amount of the RNA transcript of the normalization gene would undermine the purpose of normalization. A variant normalization gene has the consequence that the allegedly “normalized” Ct value of a “gene of interest” is actually not normalized. In this case, it depends on factors other than the transcript concentration of the gene of interest. Therefore, the normalization of a “gene of interest” using a variant gene or the correspondingly variant average of several non-variant genes is not a normalization at all, because the correspondingly formed “two-gene ratio” does not allow conclusions to be made on the transcript quantity of the “gene of interest.”

Because the invariance of a single gene is often difficult to ensure, the RNA expression levels of several reasonably invariant genes are averaged in practice, expecting that the average of these genes exhibits a lower biological variance than the RNA concentration of each individual normalization gene.

An alternative normalization method is to average the RNA expression level of a large number of genes, including genes known to be variant, expecting that the variance of the expression of these many genes will cancel out from examined sample to examined sample, and that the average of the expression of these genes will therefore be equal in all examined samples. This method of normalization is sometimes referred to as “global scaling.”

In any event, the RNA quantity of the “gene of interest” is expressed relative to the RNA quantity of one invariant gene, to the average of the RNA quantities of some invariant genes, or to the average of a large number of arbitrarily chosen genes. This is usually done by dividing the RNA quantity of the “gene of interest” by the quantity of RNA of the reference gene, or by the average of the RNA quantities of the reference genes. Because there is a logarithmic relationship between the Ct value and the RNA quantity, the normalization is then performed by subtracting the Ct values. This method is referred to as a delta-Ct method. The normalized Ct value obtained is usually referred to as a delta-Ct value.
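The relationship between dividing RNA quantities and subtracting Ct values can be illustrated with a minimal Python sketch. The function names and the Ct values are hypothetical, the sign convention (reference minus gene of interest) follows equation (1) of this description, and the conversion to a quantity ratio assumes an idealized doubling of the product in every PCR cycle, which real assays only approximate.

```python
from statistics import mean

def delta_ct(ct_goi, ct_refs):
    """Delta-Ct: mean Ct of the reference genes minus the Ct of the gene of interest."""
    return mean(ct_refs) - ct_goi

def relative_quantity(ct_goi, ct_refs):
    """Quantity ratio GOI/reference, ASSUMING exact doubling per PCR cycle."""
    return 2.0 ** delta_ct(ct_goi, ct_refs)

# Hypothetical Ct values, chosen for illustration only.
ct_goi = 27.0
ct_refs = [22.0, 23.0, 24.0]

print(delta_ct(ct_goi, ct_refs))           # -4.0
print(relative_quantity(ct_goi, ct_refs))  # 0.0625

# Under the same assumption, halving the RNA input shifts every Ct value by
# about one cycle; the subtraction cancels this shift.
diluted = delta_ct(ct_goi + 1.0, [c + 1.0 for c in ct_refs])
print(diluted == delta_ct(ct_goi, ct_refs))
```

The last lines show why the delta-Ct value is insensitive to the input amount: a uniform shift of all Ct values drops out of the difference.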

In this way, the described EP score is calculated in two steps from the Ct values of the RNA molecules measured for the determination of the EP score: at first, the eight informative genes are normalized against the average of three invariant reference genes, and then the delta-Ct values of the eight informative genes are linearly combined.

A consequence of this approach is the fact that the transcript quantities of a total of 11 genes must be analyzed for determining the EndoPredict® score (EP score) consisting of 8 genes. Thus, about a quarter of the cost and expenses of the determination of the EndoPredict® score is required for the determination of the transcripts necessary for normalizing the measured values. Thus, it is the object of the present invention to provide a method for determining the EP score simply but reliably without having to determine the RNA quantity of normalization genes.

According to the invention, this object is achieved by a method for predicting a result relating to breast cancer in an estrogen receptor-positive and HER2-negative tumor in a breast cancer patient, the method comprising:

(a) determining the RNA expression levels of four or more of the following 8 genes in a tumor sample from the patient: UBE2C, BIRC5, DHCR7, STC2, AZGP1, RBBP8, IL6ST and MGP;

(b) mathematically combining the expression level values for the genes of the mentioned set, the values having been determined in the tumor sample, to obtain a combined score, the combined score indicating a prognosis for the patient, wherein the RNA expression levels have at least in part not been normalized before the mathematical combination.

In some embodiments the four or more genes are BIRC5, UBE2C, RBBP8, and IL6ST. Additional embodiments of the four or more genes can include any of the biomarker panels described in Table 1.

TABLE 1

Panel 1: BIRC5, UBE2C, RBBP8, and IL6ST
Panel 2: BIRC5, UBE2C, RBBP8, IL6ST, and DHCR7
Panel 3: BIRC5, UBE2C, RBBP8, IL6ST, and AZGP1
Panel 4: BIRC5, UBE2C, RBBP8, IL6ST, and MGP
Panel 5: BIRC5, UBE2C, RBBP8, IL6ST, and STC2
Panel 6: BIRC5, UBE2C, RBBP8, IL6ST, DHCR7, and AZGP1
Panel 7: BIRC5, UBE2C, RBBP8, IL6ST, DHCR7, and MGP
Panel 8: BIRC5, UBE2C, RBBP8, IL6ST, DHCR7, and STC2
Panel 9: BIRC5, UBE2C, RBBP8, IL6ST, AZGP1, and MGP
Panel 10: BIRC5, UBE2C, RBBP8, IL6ST, AZGP1, and STC2
Panel 11: BIRC5, UBE2C, RBBP8, IL6ST, MGP, and STC2
Panel 12: BIRC5, UBE2C, RBBP8, IL6ST, DHCR7, AZGP1, and MGP
Panel 13: BIRC5, UBE2C, RBBP8, IL6ST, DHCR7, AZGP1, and STC2
Panel 14: BIRC5, UBE2C, RBBP8, IL6ST, DHCR7, MGP, and STC2
Panel 15: BIRC5, UBE2C, RBBP8, IL6ST, AZGP1, MGP, and STC2
Panel 16: BIRC5, UBE2C, RBBP8, IL6ST, DHCR7, AZGP1, MGP, and STC2

It is not always optimal to normalize the RNA quantity (transcript quantity), i.e., the Ct value, of a “gene of interest” on the basis of the RNA quantity of another “gene of interest” or of the average of some or all “genes of interest.” The transcript quantities of the “genes of interest” are of course highly different among the samples because the genes in the EP score were purposefully selected to reflect the biological variance of different samples. However, to relate a variant transcript quantity to another variant transcript quantity might not be expedient, as described above, because this still would not allow one to compare transcript quantities of a “gene of interest” among the samples.

As a result, the measurement of genes in addition to the eight “genes of interest” in the EP score can be omitted only if the normalization of the “genes of interest” can be successfully dispensed with altogether.

The method according to the invention is based on the fact that the Ct values, which are raw values and, as described above, do not exclusively reflect the RNA quantities of the genes determined for the EP score, are nevertheless not normalized, and also remain unnormalized in the further course of the calculation of the EP score. Accordingly, the comparability of different EP scores determined on different tumor samples is not obtained by normalizing the Ct values of the genes from which the EP score is calculated and thereby making them comparable; instead, the comparability is advantageously reached on the level of the EP score.

This is further explained by means of the following technical measure:

The eight genes of interest of the EP score are first normalized on the basis of the average of three reference genes, and the EP score is represented as a linear combination of the total of 11 measured Ct values according to equation (3) (see below). Surprisingly, when the method according to the invention is applied to the EndoPredict® method in particular, it turns out that the sum of the linear coefficients of the eight “genes of interest” according to equation (5) is relatively small, so that the corresponding term can be neglected to a good approximation. A new EP score is obtained (equation (8)), which, although not identical with previous, conventionally calculated scores (Filipits et al.), deviates only slightly therefrom and does not deteriorate the prognostic value of the assay, the deviation thus being clinically irrelevant. An advantage of the method according to the invention is the fact that no reference genes need to be measured for calculating the new EP score: this simplifies the production of test kits (PCR primers and probes) and the performance of the test on the user's part.

Indeed, the individual transcript amounts of the individual genes are no longer normalized in the method according to the invention. Therefore, normalized expression levels are no longer derivable even within the calculation of the EP score. Thus, the comparability of different EP scores from different samples is no longer derived from the comparability of the Ct values (these are actually not comparable among the samples), but from the fact that the sum of the coefficients used for the linear combination of the Ct values is not substantially different from zero. As a consequence, although the measurement of one and the same tissue sample may yield significantly different raw Ct values of all individual genes because of different starting quantities and different RNA qualities, the sum of all these weighted individual genes is nevertheless essentially constant. For this reason, a new EP score that is well comparable among the samples is obtained despite a lack of normalization of the individual genes.
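The effect described above can be checked numerically with a short Python sketch. It uses the linear coefficients c_(i) listed in Table 2; the Ct values, the size of the shift and the function names are hypothetical and serve only as an illustration, not as the implementation of the assay.

```python
# Linear coefficients c_i as listed in Table 2 (Filipits et al.).
COEFFS = {
    "BIRC5": 0.407753, "RBBP8": -0.347558, "UBE2C": 0.388326,
    "IL6ST": -0.305020, "AZGP1": -0.264064, "DHCR7": 0.394019,
    "MGP": -0.183334, "STC2": -0.146689,
}

def weighted_ct_sum(ct):
    """Return -sum(c_i * x_i) over the eight genes of interest."""
    return sum(-COEFFS[gene] * ct[gene] for gene in COEFFS)

def shift_all(ct, shift):
    """Add the same offset to every Ct value, e.g. for a lower RNA input."""
    return {gene: value + shift for gene, value in ct.items()}

# Hypothetical raw Ct values of one sample.
ct = {
    "BIRC5": 30.2, "RBBP8": 27.5, "UBE2C": 29.8, "IL6ST": 26.1,
    "AZGP1": 25.4, "DHCR7": 28.9, "MGP": 24.7, "STC2": 27.3,
}

shift = 2.0  # hypothetical: all Ct values two cycles later
change = weighted_ct_sum(shift_all(ct, shift)) - weighted_ct_sum(ct)
print(round(change, 6))
```

Although every individual Ct value moves by two cycles, the weighted sum changes only by the shift multiplied by the negative sum of the coefficients, which is small because positive and negative coefficients largely cancel.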

The normalization-free calculation of the EP score cannot be derived mathematically from the already published calculation of the EP score with normalization. This is because the two kinds of calculation are not equivalent. Especially in EndoPredict®, the possibility to dispense with measuring the normalization genes results from the fact that the sum of the coefficients of the linear combination of the delta-Ct values is not large, because the terms are in part positive and in part negative numbers. Thus, setting this sum to zero is an error in strictly mathematical terms. However, the error produced is small and acceptable, especially against the background of the imprecision of the measured values. Moreover, it allows a greatly simplified and yet reliable determination in the specific case of the EP score.

The first step in the calculation of the EP score is the determination of delta-Ct values. The following definition is used:

Δ_(i)=20−x _(i) +r  (1)

In this equation, Δ_(i) is the delta-Ct value of the “gene of interest” i, x_(i) is the Ct value of gene i, and r is the average of the Ct values of the three reference genes. The EP score uses eight informative genes (BIRC5, RBBP8, UBE2C, IL6ST, AZGP1, DHCR7, MGP and STC2) and three reference genes (CALM2, OAZ1 and RPL37A).

In the second step, the eight delta-Ct values are combined into one score.

$\begin{matrix}{{EP} = {\sum\limits_{i = 1}^{8}{c_{i}\Delta_{i}}}} & (2)\end{matrix}$

Herein, EP is the (unscaled) EP score, and c_(i) is the linear coefficient for the informative gene i. As already published by Filipits, the linear coefficients are:

TABLE 2

i    Gene name    c_(i)
1    BIRC5        0.407753
2    RBBP8        −0.347558
3    UBE2C        0.388326
4    IL6ST        −0.305020
5    AZGP1        −0.264064
6    DHCR7        0.394019
7    MGP          −0.183334
8    STC2         −0.146689

The third and last step of the calculation of the EP score consists of a scaling and limiting step. However, it is not relevant to the result and merely transfers the results to a more intuitive scale. This step will be ignored in the further considerations.

In order to calculate EP directly from the Ct values x_(i), equation (1) is substituted into equation (2) to obtain equation (3).

$\begin{matrix}{{EP} = {\sum\limits_{i = 1}^{8}{c_{i}\left( {20 - x_{i} + r} \right)}}} & (3)\end{matrix}$
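Equations (1) to (3) can be written out as a short Python sketch. The coefficients and gene names are those given in Table 2 and in the text; the Ct values and the function name `ep_unscaled` are hypothetical, and the final scaling and limiting step is left out, as in the derivation above.

```python
from statistics import mean

# Linear coefficients c_i as listed in Table 2 (Filipits et al.).
COEFFS = {
    "BIRC5": 0.407753, "RBBP8": -0.347558, "UBE2C": 0.388326,
    "IL6ST": -0.305020, "AZGP1": -0.264064, "DHCR7": 0.394019,
    "MGP": -0.183334, "STC2": -0.146689,
}
REFERENCE_GENES = ("CALM2", "OAZ1", "RPL37A")

def ep_unscaled(ct):
    """Unscaled EP score per equation (3): sum of c_i * (20 - x_i + r)."""
    r = mean(ct[gene] for gene in REFERENCE_GENES)  # average reference Ct
    return sum(c * (20.0 - ct[gene] + r) for gene, c in COEFFS.items())

# Hypothetical Ct values of one sample (eight informative, three reference genes).
ct = {
    "BIRC5": 30.2, "RBBP8": 27.5, "UBE2C": 29.8, "IL6ST": 26.1,
    "AZGP1": 25.4, "DHCR7": 28.9, "MGP": 24.7, "STC2": 27.3,
    "CALM2": 22.6, "OAZ1": 23.1, "RPL37A": 23.3,
}

print(round(ep_unscaled(ct), 6))
```

In this form all 11 Ct values enter the score, which is the starting point for separating the contribution of the reference genes.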

Now, the Ct values of the informative genes x₁, . . . , x₈ can be separated from the average of the Ct values of the reference genes r by factoring:

$\begin{matrix}{{EP} = {{\sum\limits_{i = 1}^{8}{{- c_{i}}x_{i}}} + {\left( {20 + r} \right){\sum\limits_{i = 1}^{8}c_{i}}}}} & (4)\end{matrix}$

The second factor in the second addend can be calculated with the aid of Table 2.

$\begin{matrix}{{\sum\limits_{i = 1}^{8}c_{i}} = {- 0.056567}} & (5)\end{matrix}$
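The value in equation (5) can be reproduced directly from the coefficients of Table 2. The following Python lines are only an arithmetic check; the variable names are chosen for illustration.

```python
import math

# Linear coefficients c_i as listed in Table 2 (Filipits et al.).
COEFFS = {
    "BIRC5": 0.407753, "RBBP8": -0.347558, "UBE2C": 0.388326,
    "IL6ST": -0.305020, "AZGP1": -0.264064, "DHCR7": 0.394019,
    "MGP": -0.183334, "STC2": -0.146689,
}

coeff_sum = math.fsum(COEFFS.values())
smallest_addend = min(abs(c) for c in COEFFS.values())

print(f"{coeff_sum:.6f}")                # -0.056567
print(abs(coeff_sum) < smallest_addend)  # True
```

The second line of output confirms that the absolute value of the sum is smaller than the absolute value of every single coefficient.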

Thus, in the special case of the coefficients in EndoPredict®, the absolute value of this sum is relatively small (significantly smaller than any of its addends) and therefore, as a special case, allows the following surprising approximation of a new EP score:

$\begin{matrix}{\overset{\_}{EP} = {{\sum\limits_{i = 1}^{8}{{- c_{i}}x_{i}}} + {\left( {20 + \overset{\sim}{r}} \right){\sum\limits_{i = 1}^{8}c_{i}}}}} & (6)\end{matrix}$

Here, only two variables were replaced as compared to equation (4): $\overset{\_}{EP}$ designates the new approximated EP score, and $\overset{\sim}{r}$ designates a constant, which is determined below. In particular, $\overset{\sim}{r}$ (unlike r) is not dependent on measured values of the patient sample in question.

Now, after the definition of the approximated EP score according to equation (6), what is of interest above all is the difference between the new EP score and the previous EP score according to equation (4). It is obtained by subtracting equation (4) from equation (6) to give equation (7).

$\begin{matrix}{{\overset{\_}{EP} - {EP}} = {{\left( {\overset{\sim}{r} - r} \right){\sum\limits_{i = 1}^{8}c_{i}}} = {{- 0.056567} \cdot \left( {\overset{\sim}{r} - r} \right)}}} & (7)\end{matrix}$

On the basis of this equation, it is clear that the alteration of the score, i.e., $\overset{\_}{EP} - {EP}$, can be kept small if the constant $\overset{\sim}{r}$ is selected to be close to r.

Empirical studies showed that r is typically within the interval of from 19 to 27. This value results from the RNA quantity that can typically be isolated from a tumor sample. In practice, a value of from $\overset{\sim}{r}$=21 to 25, preferably $\overset{\sim}{r}$=23, suggests itself for $\overset{\sim}{r}$. Thus, for $\overset{\sim}{r}$=23, |$\overset{\sim}{r}$−r|≤4 would apply, and the deviation |$\overset{\_}{EP} - {EP}$|≤0.226270 would be acceptably small (this means a maximum variation of 0.339406 for the EP score scaled according to Filipits et al.; this value is thus smaller than half the width of the 95% confidence interval of the measuring accuracy of about 0.5). In accordance with the above and because of the small value of the sum over c_(i), there is obtained as an approximation for the calculation of the EP score according to equation (6):

$\begin{matrix}{\overset{\_}{EP} = {{\sum\limits_{i = 1}^{8}{{- c_{i}}x_{i}}} - 2.432381}} & (8)\end{matrix}$

Thus, an approximated form of the EP score, which is not completely invariant towards variations of the RNA input amount because normalization is omitted, can actually be derived according to equation (8). However, it allows the test to be performed in a clearly simpler way. Because of the omission of normalization, 3 of the 11 RNA measurements can be omitted. Thus, because of the reduced number of measurements necessary for the determination of the EP score, the overall precision of the measurement and thus the repeatability of the overall result is also improved.
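The normalization-free calculation according to equation (8) and the deviation according to equation (7) can be sketched in Python as follows. The coefficients are those of Table 2 and the constant 23 is the preferred value stated above; the function names and the Ct values are hypothetical, and the scaling and limiting step is again omitted.

```python
# Linear coefficients c_i as listed in Table 2 (Filipits et al.).
COEFFS = {
    "BIRC5": 0.407753, "RBBP8": -0.347558, "UBE2C": 0.388326,
    "IL6ST": -0.305020, "AZGP1": -0.264064, "DHCR7": 0.394019,
    "MGP": -0.183334, "STC2": -0.146689,
}
R_TILDE = 23.0                          # constant replacing the reference average r
COEFF_SUM = sum(COEFFS.values())        # about -0.056567, equation (5)
OFFSET = (20.0 + R_TILDE) * COEFF_SUM   # about -2.432381, constant in equation (8)

def ep_without_normalization(ct):
    """Approximated unscaled EP score per equation (8); no reference genes needed."""
    return sum(-c * ct[gene] for gene, c in COEFFS.items()) + OFFSET

def deviation_from_normalized(r):
    """Equation (7): approximated minus conventional score for a reference average r."""
    return (R_TILDE - r) * COEFF_SUM

# Hypothetical raw Ct values of the eight informative genes.
ct = {
    "BIRC5": 30.2, "RBBP8": 27.5, "UBE2C": 29.8, "IL6ST": 26.1,
    "AZGP1": 25.4, "DHCR7": 28.9, "MGP": 24.7, "STC2": 27.3,
}

print(round(OFFSET, 6))
print(round(ep_without_normalization(ct), 6))
print(round(deviation_from_normalized(19.0), 6), round(deviation_from_normalized(27.0), 6))
```

The last line evaluates the deviation at both ends of the empirically observed interval for r, where its absolute value is largest.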

From the disclosure, it can be seen that it is not only possible to perform an approximate calculation of the EP score according to equation (8) by normalizing none of the RNA expression levels of any gene. It is also possible to calculate part of an approximate EP score from the normalized value of the RNA expression of some genes by analogy with equation (3), and to calculate some other part of the EP score from the unnormalized RNA expression levels of the remaining genes by analogy with equation (6) according to equation (9):

$\begin{matrix}{\overset{\_}{EP} = {{\sum\limits_{i = 1}^{k}{{- c_{i}}x_{i}}} + {\left( {20 + \overset{\sim}{r}} \right){\sum\limits_{i = 1}^{k}c_{i}}} + {\sum\limits_{i = {k + 1}}^{8}{{- c_{i}}x_{i}}} + {\left( {20 + r} \right){\sum\limits_{i = {k + 1}}^{8}c_{i}}}}} & (9)\end{matrix}$

wherein k must be a natural number from 1 to 6. Further, it is important for the genes whose measuring results are included in the modified EP score without normalization to be selected in such a way that the absolute value of the sum of linear coefficients c_(i) corresponding to such genes according to Table 2 is as low as possible, preferably lower than 0.06. Thus, suitable gene combinations that can be included in the modified EP score without normalization are, for example, BIRC5, AZGP1, STC2 (sum over c_(i) equals −0.003) or BIRC5 and IL6ST and STC2 (sum over c_(i) equals −0.043956) or IL6ST and DHCR7 and STC2 (sum over c_(i) equals −0.05769). The respectively remaining genes of the EP score would then be included in the modified EP score in an individually normalized form.
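The partial normalization of equation (9) and the selection criterion for the unnormalized genes can be illustrated with the following Python sketch. The coefficients, gene names, the constant 23 and the limit of 0.06 are taken from the text; the function names and Ct values are hypothetical, and the subset search is only one possible way to list candidate combinations.

```python
from itertools import combinations
from statistics import mean

# Linear coefficients c_i as listed in Table 2 (Filipits et al.).
COEFFS = {
    "BIRC5": 0.407753, "RBBP8": -0.347558, "UBE2C": 0.388326,
    "IL6ST": -0.305020, "AZGP1": -0.264064, "DHCR7": 0.394019,
    "MGP": -0.183334, "STC2": -0.146689,
}
REFERENCE_GENES = ("CALM2", "OAZ1", "RPL37A")
R_TILDE = 23.0  # constant used in place of r for the unnormalized genes

def ep_partial(ct, unnormalized):
    """Equation (9): genes in `unnormalized` use R_TILDE, all others the measured r."""
    r = mean(ct[gene] for gene in REFERENCE_GENES)
    score = 0.0
    for gene, c in COEFFS.items():
        reference = R_TILDE if gene in unnormalized else r
        score += c * (20.0 - ct[gene] + reference)
    return score

def low_sum_subsets(size, limit=0.06):
    """Gene combinations whose coefficient sum has an absolute value below `limit`."""
    return [combo for combo in combinations(COEFFS, size)
            if abs(sum(COEFFS[gene] for gene in combo)) < limit]

# Hypothetical Ct values; the three reference genes average to 25.
ct = {
    "BIRC5": 30.2, "RBBP8": 27.5, "UBE2C": 29.8, "IL6ST": 26.1,
    "AZGP1": 25.4, "DHCR7": 28.9, "MGP": 24.7, "STC2": 27.3,
    "CALM2": 24.0, "OAZ1": 25.0, "RPL37A": 26.0,
}

subset = {"BIRC5", "AZGP1", "STC2"}
difference = ep_partial(ct, subset) - ep_partial(ct, set())
print(round(difference, 6))
print(("BIRC5", "AZGP1", "STC2") in low_sum_subsets(3))
```

With an empty set of unnormalized genes the function reduces to the conventional calculation of equation (3), so the printed difference shows how little the score changes for the first combination named above.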

Absolute coefficients are thus for proliferation genes: BIRC5 (coefficient: 0.41), UBE2C (0.39), DHCR7 (0.39) and differentiation/ER signalling genes: RBBP8 (0.35), IL6ST (0.31), AZGP1 (0.26), MGP (0.18), STC2 (0.15).

Example

Aspects of the present teachings can be further understood in light of the following examples, which should not be construed as limiting the scope of the present teachings in any way.

This example demonstrates the ability to determine an EndoPredict® EP score (an “EP score”) either without having to determine the RNA quantity of normalization genes, or by determining RNA quantities using partial normalization.

Total RNA was extracted from 881 samples from patients with ER+, HER2− primary breast cancer using a Siemens silica bead-based, fully automated isolation method for RNA from one 10 μm whole FFPE tissue section on a Hamilton MICROLAB STARlet liquid handling robot (17). The robot, buffers and chemicals were part of a Siemens VERSANT® kPCR Molecular System (Siemens Healthcare Diagnostics, Tarrytown, N.Y.; not commercially available in the USA). Briefly, 150 μl FFPE buffer (Buffer FFPE, research reagent, Siemens Healthcare Diagnostics) was added to each section and incubated for 30 minutes at 80° C. with shaking to melt the paraffin. After cooling down, proteinase K was added and incubated for 30 minutes at 65° C. After lysis, residual tissue debris was removed from the lysis fluid by a 15-minute incubation step at 65° C. with 40 μl silica-coated iron oxide beads. The beads with surface-bound tissue debris were separated with a magnet and the lysates were transferred to a standard 2 ml deep well-plate (96 wells). There, the total RNA and DNA was bound to 40 μl unused beads and incubated at room temperature. Chaotropic conditions were produced by the addition of 600 μl lysis buffer. Then, the beads were magnetically separated and the supernatants were discarded. Afterwards, the surface-bound nucleic acids were washed three times followed by magnetization, aspiration and disposal of supernatants. Afterwards, the nucleic acids were eluted by incubation of the beads with 100 μl elution buffer for 10 minutes at 70° C. with shaking. Finally, the beads were separated and the supernatant incubated with 12 μl DNase I Mix (2 μl DNase I (RNase free); 10 μl 10× DNase I buffer; Ambion/Applied Biosystems, Darmstadt, Germany) to remove contaminating DNA. After incubation for 30 minutes at 37° C., the DNA-free total RNA solution was aliquoted and stored at −80° C. or directly used for mRNA expression analysis by reverse transcription kinetic PCR (RT-kPCR).

All the samples were analyzed with one-step RT-kPCR in an ABI PRISM® 7900HT (Applied Biosystems, Darmstadt, Germany). The SuperScript® III Platinum® One-Step Quantitative RT-PCR System with ROX (6-carboxy-X-rhodamine) (Invitrogen, Karlsruhe, Germany) was used according to the manufacturer's instructions. Respective probes and primers have been described previously (EP 2 553 118 B1). The PCR conditions were as follows: 30 minutes at 50° C., 2 minutes at 95° C. followed by 40 cycles of 15 seconds at 95° C. and 30 seconds at 60° C. All the PCR assays were performed in triplicate.

Following extraction of RNA and assessment of mRNA levels of the 8 EP genes-of-interest BIRC5, UBE2C, DHCR7, RBBP8, IL6ST, AZGP1, MGP, and STC2, as well as the three reference genes RPL37A, CALM2, and OAZ1 by RT-PCR, alternative algorithms were applied that lacked normalization of all eight EP genes or different subsets of EP genes. The first step in the calculation of the EP score was the determination of delta-Ct values. The following definition was used:

Δ_(i)=20−x _(i) +r  (1)

In this equation, Δ_(i) is the delta-Ct value of the “gene of interest” i, x_(i) is the Ct value of gene i, and r is the average of the Ct values of the three reference genes as described herein.

In the second step, the eight delta-Ct values were combined into one score.

$\begin{matrix}{{EP} = {\sum\limits_{i = 1}^{8}{c_{i}\Delta_{i}}}} & (2)\end{matrix}$

Herein, EP is the (unscaled) EP score, and c_(i) is the linear coefficient for the informative gene i. The linear coefficients were those used as published by Filipits (2011).

In order to calculate EP directly from the Ct values x_(i), equation (1) was substituted into equation (2) to obtain equation (3).

$\begin{matrix}{{EP} = {\sum\limits_{i = 1}^{8}{c_{i}\left( {20 - x_{i} + r} \right)}}} & (3)\end{matrix}$

Ct values of the informative genes x₁, . . . , x₈ were then separated from the average of the Ct values of the reference genes r by factoring:

$\begin{matrix}{{EP} = {{\sum\limits_{i = 1}^{8}{{- c_{i}}x_{i}}} + {\left( {20 + r} \right){\sum\limits_{i = 1}^{8}c_{i}}}}} & (4)\end{matrix}$

The second factor in the second addend was then calculated using the linear coefficients.

$\begin{matrix}{{\sum\limits_{i = 1}^{8}c_{i}} = {- 0.056567}} & (5)\end{matrix}$

Thus, the absolute value of this sum was relatively small, allowing approximation of a new EP score:

$\begin{matrix}{\overset{\_}{EP} = {{\sum\limits_{i = 1}^{8}{{- c_{i}}x_{i}}} + {\left( {20 + \overset{\sim}{r}} \right){\sum\limits_{i = 1}^{8}c_{i}}}}} & (6)\end{matrix}$

Here, only two variables were replaced as compared to equation (4): $\overset{\_}{EP}$ designates the new approximated EP score, and $\overset{\sim}{r}$ designates a constant equaling 23 as described in the specification herein. In particular, $\overset{\sim}{r}$ (unlike r) is not dependent on measured values of the patient sample in question.

Now, after the definition of the approximated EP score according to equation (6), the difference between the new EP score and the previous EP score was obtained by subtracting equation (4) from equation (6) to give equation (7).

$\begin{matrix}{{\overset{\_}{EP} - {EP}} = {{\left( {\overset{\sim}{r} - r} \right){\sum\limits_{i = 1}^{8}c_{i}}} = {{- 0.056567} \cdot \left( {\overset{\sim}{r} - r} \right)}}} & (7)\end{matrix}$

On the basis of this equation, it was clear that the alteration of the score, i.e., $\overset{\_}{EP} - {EP}$, can be kept small if the constant $\overset{\sim}{r}$ is selected to be close to r.

Because of the small value of the sum over c_(i), an approximation for the calculation of the EP score was obtained according to equation (6) with $\overset{\sim}{r}$=23:

$\begin{matrix}{\overset{\_}{EP} = {{\sum\limits_{i = 1}^{8}{{- c_{i}}x_{i}}} - 2.432381}} & (8)\end{matrix}$

It was also possible to calculate part of an approximate EP score from the normalized value of the RNA expression of some genes by analogy with equation (3), and to calculate some other part of the EP score from the unnormalized RNA expression levels of the remaining genes by analogy with equation (6) according to equation (9):

$\begin{matrix}{\overset{\_}{EP} = {{\sum\limits_{i = 1}^{k}{{- c_{i}}x_{i}}} + {\left( {20 + \overset{\sim}{r}} \right){\sum\limits_{i = 1}^{k}c_{i}}} + {\sum\limits_{i = {k + 1}}^{8}{{- c_{i}}x_{i}}} + {\left( {20 + r} \right){\sum\limits_{i = {k + 1}}^{8}c_{i}}}}} & (9)\end{matrix}$

wherein k must be a natural number from 1 to 6. Thus, suitable gene combinations that can be included in the modified EP score without normalization are, for example, BIRC5, AZGP1, STC2 (sum over c_(i) equals −0.003) (FIG. 1) or BIRC5 and IL6ST and STC2 (sum over c_(i) equals −0.043956) (FIG. 2) or IL6ST and DHCR7 and STC2 (sum over c_(i) equals −0.05769) (FIG. 3). The respectively remaining genes of the EP score would then be included in the modified EP score in an individually normalized form. FIG. 4 demonstrates the lack of normalization of all eight EP genes.

1. A method for predicting a result relating to breast cancer in an estrogen receptor-positive and HER2-negative tumor in a breast cancer patient, the method comprising: (a) determining the RNA expression levels of at least 4 of the following 8 genes in a tumor sample from the patient: UBE2C, BIRC5, DHCR7, STC2, AZGP1, RBBP8, IL6ST and MGP; (b) mathematically combining the expression level values for the genes of the mentioned set, the values having been determined in the tumor sample, to obtain a combined score, the combined score indicating a prognosis for the patient, wherein the RNA expression level values have at least in part not been normalized before the mathematical combination.
2. The method according to claim 1, wherein the at least 4 genes are BIRC5, UBE2C, RBBP8, and IL6ST.
3. The method according to claim 1, wherein the at least 4 genes are any of the panels described in Table 1.
4. The method according to any of claims 1 to 3, wherein said mathematically combining the expression levels is effected by using the formula $\overset{\_}{EP} = {{\sum\limits_{i = 1}^{8}{{- c_{i}}x_{i}}} - 2.432381}$ or $\overset{\_}{EP} = {{\sum\limits_{i = 1}^{k}{{- c_{i}}x_{i}}} + {\left( {20 + \overset{\sim}{r}} \right){\sum\limits_{i = 1}^{k}c_{i}}} + {\sum\limits_{i = {k + 1}}^{8}{{- c_{i}}x_{i}}} + {\left( {20 + r} \right){\sum\limits_{i = {k + 1}}^{8}{c_{i}.}}}}$
5. The method according to any one of claims 1 to 4, wherein said patient has received endocrine therapy or is contemplated to receive endocrine treatment.
6. The method according to any one of claims 1 to 5, wherein a risk of developing breast cancer recurrence or cancer-related death is predicted.
7. Method according to any of claims 1 to 6, wherein said expression level is determined as a messenger RNA expression level.
8. Method according to claim 7, wherein said expression level is determined by at least one of a PCR based method, a microarray based method, and a hybridization based method.
9. Method of any one of the preceding claims, wherein said determination of expression levels is in a formalin-fixed paraffin embedded tumor sample or in a fresh-frozen tumor sample.
10. Method of any one of the preceding claims, wherein one, two or more thresholds are determined for said combined score, that discriminate into high and low risk, high, intermediate and low risk, or more risk groups by applying the threshold on the combined score.
11. Method of any of claims 1 to 10, wherein a high combined score is indicative of benefit from cytotoxic chemotherapy.
12. Method of any one of the preceding claims, wherein information regarding nodal status of the patient is processed in the step of mathematically combining expression level values for the genes to yield a combined score.
13. Method of claim 12, wherein said information regarding nodal status is a numerical value if said nodal status is negative and said information is a different numerical value if said nodal status is positive and a different or identical number if said nodal status is unknown.
14. A kit for performing a method of any of claims 1 to 13, said kit comprising a set of oligonucleotides capable of specifically binding sequences or to sequences of fragments of the genes in a combination of genes, wherein said combination comprises determining the RNA expression levels of at least 4 of the following 8 genes in a tumor sample from the patient: UBE2C, BIRC5, DHCR7, STC2, AZGP1, RBBP8, IL6ST and MGP.
15. The kit according to claim 14, wherein the at least 4 genes are BIRC5, UBE2C, RBBP8, and IL6ST.
16. The kit according to claim 14, wherein the at least 4 genes are any of the panels described in Table 1.
17. A computer program product capable of processing values representative of expression levels of a set of genes, mathematically combining said values to yield a combined score, wherein said combined score is indicative of efficacy from endocrine therapy of said patient, according to the methods of any of claims 1 to 13.